Abstract

We suggest a method for holding a dictionary data structure, which
maps keys to values, in the spirit of Bloom Filters. The space
requirements of the dictionary we suggest are much smaller than those of
a hashtable. We allow storing n keys, each mapped to a value which is a
string of k bits. Our suggested method requires nk + o(n) bits of space
to store the dictionary and O(n) time to produce the data structure,
and allows answering a membership query in O(1) memory probes.
The dictionary size does not depend on the size of the keys. However,
reducing the space requirements of the data structure comes at a
certain cost. Our dictionary has a small probability of a one-sided
error. When attempting to obtain the value for a key that is stored in
the dictionary we always get the correct answer. However, when testing
for membership of an element that is not stored in the dictionary, we
may get an incorrect answer, and when requesting the value of such
an element we may get a certain random value. Our method is based
on solving equations in $GF(2^k)$ and using several hash functions.

Another significant advantage of our suggested method is that we
do not require sophisticated hash functions. We only require
pairwise independent hash functions. We also suggest a data structure
that requires only nk bits of space, has $O(n^2)$ preprocessing time,
and has O(log n) query time. However, this data structure requires
uniform hash functions.

In order to replace a Bloom Filter of n elements with an error
probability of $2^{-k}$, we require nk + o(n) memory bits, O(1) query
time, O(n) preprocessing time, and only pairwise independent hash
functions. Even the most advanced previously known Bloom Filter would
require nk + O(n) space and uniform hash functions, so our method
is significantly less space consuming, especially when k is small.

Our suggested dictionary can replace Bloom Filters, and has many
applications. A few application examples are dictionaries for storing
bad passwords, differential files in databases, Internet caching and
distributed storage systems.

1 Introduction



A Bloom Filter is a very basic data structure which, given a set of n
elements, allows us to quickly decide whether a given element is in the
set or not. The main advantage of Bloom Filters is that they are very
memory efficient: a Bloom Filter only requires space linear in the
number of elements in the set, while other data structures use memory
linear in the total size of the represented elements in the set. When
the elements stored in the set do not have a succinct representation,
this is a very significant advantage. For example, consider strings with
an average size of 800 bits. A hashtable for storing 100,000,000 such
strings would require at least 800 * 100,000,000 bits, so a hard disk
must be used for the table, and lookups would be rather slow. A basic
Bloom Filter based structure would only require 145,000,000 bits, which
can easily be stored in main memory. On the other hand, the Bloom Filter
achieves this at a certain cost: it has a certain probability of
returning a wrong answer. The error is one-sided: if the key is in the
set, the Bloom Filter will always return the correct answer, but if the
key is not in the set, it might return a wrong answer. However, for many
applications it is possible to overcome this problem and still gain from
the low space requirements of the Bloom Filter.

The main use of the Bloom Filter is to reduce the memory that the data
structure uses. The basic Bloom Filter [1] (invented in 1970) used
$n \log e$ memory bits and returned the answer using a single probe to
memory, with an error probability of $\frac{1}{2}$ (for a false
positive). One way to reduce the error probability to $2^{-k}$ is to run
the basic Bloom Filter k times; this requires $nk \log e$ memory bits
and k memory probes in order to answer a query.
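The figures quoted in this section can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the logarithm is taken base 2, the single-probe filter is modelled as m = n log2(e) bits with one hash function per key, and the fraction of set bits is approximated by 1 - e^(-n/m). The function names are ours, not the paper's.

```python
import math

def basic_filter_bits(n: int) -> float:
    """Bits used by the single-probe filter: n * log2(e) (assumed base 2)."""
    return n * math.log2(math.e)

def false_positive_rate(n: int, m: float) -> float:
    """Approximate fraction of set bits after inserting n keys into m bits
    with one hash function per key: 1 - e^(-n/m)."""
    return 1.0 - math.exp(-n / m)

n = 100_000_000                      # number of stored strings in the example
m = basic_filter_bits(n)             # roughly 1.44 * 10^8 bits
single = false_positive_rate(n, m)   # error of one single-probe filter
k = 10                               # number of independent repetitions
combined = single ** k               # all k filters must err together
total_bits = k * m                   # n * k * log2(e) bits overall
hashtable_bits = 800 * n             # lower bound for storing the raw strings
```

Under these assumptions the single filter errs on about half of the non-members, and k repetitions multiply both the space and the number of probes by k.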

During the past few years, several papers have been published on Bloom
Filters [3] [10] [4] [5] [13]. Most of them provided methods for
reducing the memory and the number of probes required, but only
considered the case where k is big enough. One more disadvantage of
these newer methods is that they do not allow "insertion" operations,
which were possible to perform using the original Bloom Filter
technique. Yet another disadvantage of these newer methods is that they
require universal hash functions. Such functions are computationally
inefficient, or have large memory requirements.

In this paper we provide a new data structure that can replace Bloom
Filters and has lower space requirements. Our data structure requires
nk + o(n) memory bits (which is optimal up to o(n)), and each query
takes O(1) memory probes. However, like most of the other Bloom Filter
replacements, our data structure is static and does not support
insertions. Building our data structure requires O(n) preprocessing time
and O(n) memory. This data structure is based on solving equations, and
uses hash functions. We only require hash functions that are pairwise
independent.

In addition, we suggest a similar data structure that requires only nk
memory bits, O(log n) query time, and $O(n^2)$ preprocessing time.
However, this data structure requires uniform hash functions.

1.1 Applications of Bloom Filters 

Bloom Filters, as well as Bloom Filter replacements such as the one we 
suggest, have many applications. A good survey of Bloom Filter uses can 
be found in [2]. A few examples are given below. 

Dictionaries: Early versions of UNIX's spell checker used a Bloom Filter
of the dictionary instead of the dictionary itself. This Bloom Filter
left several misspelled words undetected, but memory in those days was a
valuable resource and the memory it saved was worth it [9] [11].

The Bloom Filter was proposed by Spafford [14] as a method to succinctly
store a dictionary of unsuitable passwords for security purposes. Manber
and Wu describe a simple way to extend the technique so that passwords
that are within edit distance 1 of a dictionary word are also not
allowed [8]. In this setting, a false positive could force a user to
avoid a password even if it is not really in the set of unsuitable
passwords.

Databases: Bloom Filters can also be used for differential files
[7] [12]. Suppose that all the changes to a database that occur during
the day are stored in a differential file and are applied back to the
database only at the end of the day. During that day, every read from
the database should first be checked against the differential file to be
sure that the record read is the most recent. This file might be large,
so reading through it can be slow compared to querying the database, but
the check is still required. A possible solution to this problem is
keeping a Bloom Filter of the records that have changed. Here, a false
positive forces a read of the differential file even when a record has
not been changed.

Internet Cache Protocol: Fan, Cao, Almeida, and Broder describe Summary
Cache, which uses Bloom Filters for Web cache sharing [6]. In this
setup, proxies cooperate in the following way: on a cache miss, a proxy
attempts to determine if another proxy cache holds the desired Web page;
if so, a request is made to that proxy rather than trying to obtain that
page from the Web. For such a scheme to be effective, proxies must know
the contents of other proxy caches. In Summary Cache, to reduce message
traffic, proxies do not transfer URL lists corresponding to the exact
contents of their caches, but instead periodically broadcast Bloom
Filters that represent the contents of their cache. If a proxy wishes to
determine if another proxy has a page in its cache, it checks the
appropriate Bloom Filter. In the case of a false positive, a proxy may
request a page from another proxy, only to find that that proxy does not
actually have that page cached. In that case, some additional delay has
been incurred. However, the load on the proxy servers is reduced, which
makes them work faster.

Caching for Google's BigTable: BigTable is a distributed storage system
for managing structured data that is designed to scale to a very large
size: petabytes of data across thousands of commodity servers. Many
projects at Google store data in BigTable, including web indexing,
Google Earth, and Google Finance. These applications place very
different demands on BigTable, both in terms of data size (from URLs to
web pages to satellite imagery) and latency requirements (from back-end
bulk processing to real-time data serving). Despite these varied
demands, BigTable has successfully provided a flexible, high-performance
solution for all of these Google products. In some BigTable applications
most of the queries are for keys that are not in the table. BigTable
uses a Bloom Filter to determine whether a queried key is in the table
in the first place, thus reducing disk accesses. A Bloom Filter can also
be used on the client side to reduce communication and latency.

2 Outline 

The structure of this paper is as follows. In section 3 we define the
dictionary data structure and give a high-level view of our method, as
well as a basic result. In section 4 we show how to improve the data
structure to support queries in O(1) time, and how to do the
preprocessing in O(n) time. In section 5 we show several methods to
reduce the constants hidden in the space complexity, which may be
important in practice. In section 6 we explain why and how simple
pairwise independent hash functions are enough. In section 7 we show how
to use the dictionary data structure in order to get a good Bloom Filter
replacement.

3 Dictionary Based on Matrix Solving 

Dictionaries are data structures that hold key-value pairs. This section
describes a method for concise representation of dictionaries with
one-sided errors, in the spirit of Bloom Filters.

Definition 1. A one-sided error dictionary (U, k, n) is a data structure
that holds values for keys. It is a mapping from
$x_1, x_2, \ldots, x_n \in U$ to
$d_1, d_2, \ldots, d_n \in \{0, 1, \ldots, 2^k - 1\}$. Given a key
$x_i$, the dictionary allows retrieving $d_i$. However, given a key x
which is not one of the $x_i$'s it may return any value.

We now show how to build a dictionary which requires a storage space
of nk + o(n) bits. The high level concept behind our method is solving
equations. Assume we have a fully random hash function h from U to
n-variable equations over $GF(2^k)$ (we later show how to remove the
fully random assumption), i.e. $h : U \to GF(2^k)^n$. We go over all the
$x_i$'s and write the equation $h(x_i) \cdot b = d_i$. We get n
equations in n variables. If these equations are independent we can
solve them in $O(n^3)$ time. This can be done in a one-time
preprocessing step, after which we store the hash function h and the
vector b as our data structure. The vector b requires nk bits of space.
To answer a query x we apply h to x, compute $h(x) \cdot b$ and return
the answer. If x is one of the $x_i$'s we get the correct $d_i$. If x is
not one of the $x_i$'s we might return an erroneous answer. The overall
query time is O(n).
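The construction can be illustrated with a small sketch. It makes several simplifying assumptions that are ours rather than the paper's: rows are taken over GF(2) and the k-bit values are combined by XOR (instead of arithmetic in GF(2^k)), SHAKE-256 stands in for the fully random hash function h, and a few extra variables are used so that an independent system is found quickly. All function names are hypothetical.

```python
import hashlib

def row_for(key: str, seed: int, m: int) -> int:
    """Stand-in for h(x): a pseudo-random m-bit row over GF(2), packed in an int."""
    data = f"{seed}:{key}".encode()
    digest = hashlib.shake_256(data).digest((m + 7) // 8)
    return int.from_bytes(digest, "big") & ((1 << m) - 1)

def solve(rows, values, m):
    """Gaussian elimination over GF(2). Returns b, or None if the rows are dependent."""
    pivots = {}  # leading column -> (reduced row, reduced value)
    for row, val in zip(rows, values):
        while row:
            col = row.bit_length() - 1
            if col not in pivots:
                pivots[col] = (row, val)
                break
            prow, pval = pivots[col]
            row ^= prow
            val ^= pval
        else:
            return None  # the row reduced to zero: dependent equation set
    b = [0] * m  # variables without a pivot stay 0
    for col in sorted(pivots):  # back substitution, lowest pivot first
        row, val = pivots[col]
        rest = row ^ (1 << col)
        while rest:
            j = rest.bit_length() - 1
            val ^= b[j]
            rest ^= 1 << j
        b[col] = val
    return b

def build(pairs: dict, extra: int = 8):
    """Try seeds until the n equations in n + extra variables are independent."""
    m = len(pairs) + extra
    keys = list(pairs)
    seed = 0
    while True:
        b = solve([row_for(x, seed, m) for x in keys], [pairs[x] for x in keys], m)
        if b is not None:
            return seed, b
        seed += 1

def query(key: str, seed: int, b: list) -> int:
    """Compute h(x) . b as the XOR of the entries of b selected by the row bits."""
    row, out, j = row_for(key, seed, len(b)), 0, 0
    while row:
        if row & 1:
            out ^= b[j]
        row >>= 1
        j += 1
    return out
```

Only the seed and the vector b are kept after preprocessing; the keys themselves are not stored, which is why a query for a key outside the set returns an unrelated value instead of reporting a miss.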

However, this process only works when we get an independent set of 
equations. We now examine the probability of obtaining such an indepen- 
dent equation set. 

Theorem 3.1. The probability that our method generates an independent
set of n equations in n + c variables over the field $GF(2^k)$ is at
least $1 - \frac{1}{2^{kc}(2^k - 1)}$.

Proof. We order the generated equations according to the order in which
they are constructed. The set of equations is dependent when there
exists an i such that equation i depends on equations 1, 2, ..., i - 1.
The probability that equation i depends on equations 1, 2, ..., i - 1 is
at most $\frac{(2^k)^{i-1}}{(2^k)^{n+c}}$ (the probability is even lower
when there are dependent equations before that index). We apply the
union bound and get that the probability that there exists an i such
that equation i depends on the equations before it is at most

$$\sum_{i=1}^{n} \left(\frac{1}{2^k}\right)^{n+c-i+1} = \sum_{j=1}^{n} \left(\frac{1}{2^k}\right)^{j+c} < \frac{1}{2^{kc}(2^k - 1)}.$$ □

Corollary 3.2. Even for c = 0 we get an independent set of equations
with constant probability. Therefore we need to run the preprocessing
algorithm an expected O(1) times, each time with a different hash
function, in order to get an independent set of equations.
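The bound of Theorem 3.1 can be compared with the exact probability for small parameters. The sketch below assumes that the rows produced by h are uniformly random vectors in GF(2^k)^(n+c), in which case equation i is independent of the previous ones exactly when it avoids their span; the helper names are ours.

```python
from fractions import Fraction

def independence_probability(n: int, c: int, k: int) -> Fraction:
    """Exact probability that n uniformly random vectors in GF(2^k)^(n+c)
    are linearly independent."""
    q = 2 ** k
    p = Fraction(1)
    for i in range(1, n + 1):
        # equation i must fall outside the span of the previous i - 1 equations
        p *= 1 - Fraction(q ** (i - 1), q ** (n + c))
    return p

def theorem_bound(c: int, k: int) -> Fraction:
    """Lower bound of Theorem 3.1: 1 - 1 / (2^(kc) * (2^k - 1))."""
    q = 2 ** k
    return 1 - Fraction(1, q ** c * (q - 1))
```

For k = 1 and c = 0 the bound degenerates to 0 while the exact probability is still a constant (a little below 0.3), so a constant expected number of retries suffices in that case as well.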
The main disadvantage of this data structure is that it requires O(n)
time in order to answer a query. One possible improvement can be
achieved by using t-sparse equations.

Definition 2. t-sparse equations are equations of the form
$\sum_{i=1}^{m} a_i b_i = d$ where $|\{i \mid a_i \ne 0\}| \le t$.

Using t-sparse equations the query time shrinks to t memory probes and
O(t) time. However, we need at least $m = n(1 + e^{-t-1})$ variables in
our equation set in order to have a fully independent equation set.

Theorem 3.3. If we have n t-sparse random equations in fewer than
$m = n(1 + e^{-t-1})$ variables, the equations will be dependent with
high probability.

Proof. When we have n t-sparse random equations in
$m = n(1 + e^{-t-1})$ variables, there are some variables that we do not
use. We can view this as throwing $t \times n$ balls into m cells. The
expected number of empty cells is
$m(1 - \frac{1}{m})^{tn} \approx m e^{-tn/m}$. Therefore the expected
number of variables we use in our equations is $m(1 - e^{-tn/m})$. If
$m(1 - e^{-tn/m}) < n$, we get n equations in fewer than n variables and
therefore they will not be independent. □
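The balls-into-cells estimate from the proof is easy to evaluate numerically. The following sketch uses the same approximation as the proof, (1 - 1/m)^(tn) ~ e^(-tn/m), and reads the threshold as m = n(1 + e^(-t-1)); both function names are ours.

```python
import math

def expected_used_variables(n: int, t: int, m: float) -> float:
    """Expected number of distinct cells hit when t*n balls are thrown into
    m cells, using the approximation (1 - 1/m)^(t*n) ~ e^(-t*n/m)."""
    return m * (1.0 - math.exp(-t * n / m))

def threshold(n: int, t: int) -> float:
    """Number of variables m = n * (1 + e^(-t-1)) from the theorem."""
    return n * (1.0 + math.exp(-t - 1))
```

For example, with n = 1,000,000 and t = 3 the threshold is a little over 1,018,000 variables, of which fewer than n are expected to appear in any equation, so the n equations cannot all be independent.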

Actually, if we take $n(1 + e^{-t})$ variables we will have a good
probability of getting an independent set of equations.

Note that the preprocessing of the "sparse" data structure takes
$O(tn^2)$ time, using the Wiedemann algorithm [16] for solving sparse
linear equations.

4 Improved Dictionary 

We now show how to reduce the query time to O(1) memory probes. We
also reduce the preprocessing time to O(n). The high level idea behind
the method suggested in this section is to divide
$x_1, x_2, \ldots, x_n$ randomly into small buckets, and to run the same
algorithm on each of the buckets.

We can randomly hash the keys to $\frac{n}{s}$ buckets using a hash
function $h_1 : U \to \{1, 2, \ldots, \frac{n}{s}\}$. The expected
number of keys in each bucket is s, and if s is big enough, with high
probability there will not be a bucket with more than 2s keys (if there
is such a bucket we can choose another hash function $h_1$, and so on).
Querying for x is done by applying $h_1(x)$ and going to the
$h_1(x)$'th data structure. In that data structure we query for x as
done in section 3. The $h_1(x)$'th data structure does not contain more
than 2s keys, so it takes O(s) time to answer the query. The
preprocessing is now performed by choosing $h_1$ and checking that there
is no bucket with more than 2s keys. If there is such a bucket, we
choose another hash function $h_1$. This is repeated O(1) times. We then
divide the keys $x_1, x_2, \ldots, x_n$ into the buckets and run the
preprocessing method described in section 3 on each bucket.

Overall the preprocessing takes $O(\frac{n}{s} s^3) = \Theta(ns^2)$
time. The memory that this data structure consumes is
$nk + O(\frac{n}{s}\log n)$ bits. The $O(\frac{n}{s}\log n)$ term is
required in order to maintain pointers to each of the data structures.
Naturally, our method works best when s is small. However, if we reduce
s too much we lose the fact that with high probability there is no
bucket bigger than 2s, and the $O(\frac{n}{s}\log n)$ term becomes
significant.
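The single-level bucketing step can be sketched as follows. This is an illustration only: SHA-256 with a seed stands in for the random hash function, buckets are numbered from 0, and the function names are hypothetical.

```python
import hashlib

def bucket_of(key: str, seed: int, num_buckets: int) -> int:
    """Stand-in for h1: maps a key to a bucket index in [0, num_buckets)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

def split_into_buckets(keys, s: int):
    """Retry seeds until no bucket holds more than 2*s keys."""
    num_buckets = max(1, len(keys) // s)
    seed = 0
    while True:
        buckets = [[] for _ in range(num_buckets)]
        for x in keys:
            buckets[bucket_of(x, seed, num_buckets)].append(x)
        if max(len(b) for b in buckets) <= 2 * s:
            return seed, buckets
        seed += 1
```

Each returned bucket would then be handed to the section 3 construction on its own, so the cubic solving cost is paid on at most 2s keys at a time.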

We solve this problem by using two-level hashing. We first explain the
preprocessing and then show how to run a query. Given
$x_1, x_2, \ldots, x_n$ we hash them using
$h_1 : U \to \{1, 2, \ldots, \frac{n}{\log n}\}$, which we now only
require to be pairwise independent, to $\frac{n}{\log n}$ buckets. It
might be the case that there are some buckets with more than
$2\log^2 n$ keys. We call such big buckets bad buckets. We choose
another hash function $h_1$ only if more than $\frac{n}{\log n}$ keys
are hashed to bad buckets.

Theorem 4.1. The probability that there are more than
$\frac{n}{\log n}$ keys hashed to bad buckets is at most $\frac{1}{2}$.

Proof. We denote by $B_i$ the number of keys hashed to bucket i. Since
there are $\frac{n}{\log n}$ buckets, $E(B_i) = \log n$. Using Markov's
inequality we get:

$$\Pr[\text{Bucket } i \text{ is bad}] = \Pr[B_i > 2\log^2 n] = \Pr[B_i > 2\log n \cdot E(B_i)] < \frac{1}{2\log n}$$

Denote by $X_i$ the indicator of the event that $x_i$ is hashed to a bad
bucket, and by $X = \sum_{i=1}^{n} X_i$ the number of keys hashed to a
bad bucket. $\Pr[X_i = 1] < \frac{1}{2\log n}$, therefore
$E(X) < \frac{n}{2\log n}$ and
$\Pr[X > \frac{n}{\log n}] < \Pr[X > 2E(X)] < \frac{1}{2}$ by Markov's
inequality. □

Corollary 4.2. It takes O(n) expected time to find a hash function
$h_1$ that we can use for the rest of the procedure.
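The selection of the first-level hash function can be sketched as below. The sketch rests on assumptions of ours: keys are integers smaller than a fixed prime P, the family ((a*x + b) mod P) mod m is used as a standard, approximately pairwise independent family, logarithms are base 2, and the thresholds are read as n / log n buckets and 2 log^2 n keys per good bucket. Bucket indices start at 0 and the function names are hypothetical.

```python
import math
import random

P = (1 << 61) - 1  # a Mersenne prime; keys are assumed to be integers below P

def make_pairwise_hash(num_buckets: int, rng: random.Random):
    """h(x) = ((a*x + b) mod P) mod num_buckets for random a != 0 and b."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % num_buckets

def first_level(keys, rng: random.Random):
    """Pick h1 until at most n / log n keys fall into bad buckets
    (buckets holding more than 2 * log^2 n keys). Assumes n >= 2."""
    n = len(keys)
    log_n = math.log2(n)
    limit = 2 * log_n ** 2
    num_buckets = max(1, int(n / log_n))
    while True:
        h1 = make_pairwise_hash(num_buckets, rng)
        buckets = [[] for _ in range(num_buckets)]
        for x in keys:
            buckets[h1(x)].append(x)
        bad = [b for b in buckets if len(b) > limit]
        if sum(len(b) for b in bad) <= n / log_n:
            good = [b for b in buckets if len(b) <= limit]
            return h1, good, bad
```

Each attempt is one linear pass over the keys and, by Theorem 4.1, succeeds with probability at least 1/2, which is where the linear expected time of Corollary 4.2 comes from.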

After we find a good hash function $h_1$, we deal with all the keys that
are hashed to a bad bucket using a regular dictionary data structure. It
takes at most $O(\frac{n}{\log n}) = o(n)$ bits (we can easily modify it
to take $O(\frac{n}{\log^c n})$ bits for any constant c).

Denote by $B_i$ the number of keys hashed by $h_1$ to bucket i. Each
good bucket i (such that $B_i \le 2\log^2 n$) is split again into
sub-buckets using $h_{2,i} : U \to \{1, 2, \ldots, m_i\}$, where the
number of sub-buckets $m_i$ is proportional to
$B_i / \sqrt{\frac{\log n}{2k}}$ (we now assume that $h_{2,i}$ is fully
random; in section 6 we show how to relax this assumption). If we get a
sub-bucket which is bigger than $\sqrt{\frac{\log n}{2k}}$ we choose
another $h_{2,i}$.

Theorem 4.3. When we split a bucket into sub-buckets, the probability
that there exists a sub-bucket with more than $\sqrt{\frac{\log n}{2k}}$
keys hashed to it is at most $\frac{1}{2}$.

Proof. The expected number of keys hashed to a sub-bucket is a constant
fraction of $\sqrt{\frac{\log n}{2k}}$. Using Chernoff's inequality we
get that the probability for each sub-bucket to have more than
$\sqrt{\frac{\log n}{2k}}$ keys is much smaller than $\frac{1}{2}$
divided by the number of sub-buckets. Using the union bound over the
sub-buckets of the bucket, we get that the probability that there exists
a sub-bucket with more than $\sqrt{\frac{\log n}{2k}}$ keys is smaller
than $\frac{1}{2}$. □

Corollary 4.4. It takes $O(B_i)$ time to find such an $h_{2,i}$.
Overall, finding a hash function $h_{2,i}$ for all i's requires O(n)
time.

We now have many smaller dictionary sub-problems. Each one of them has a
size of less than $\sqrt{\frac{\log n}{2k}}$. We solve each one of them
using the method described in section 3. For each sub-problem we get a
random matrix of size bounded by
$\sqrt{\frac{\log n}{2k}} \times \sqrt{\frac{\log n}{2k}}$ over
$GF(2^k)$. The number of different such matrices is at most
$2^{k\left(\sqrt{\frac{\log n}{2k}}\right)^2} = \sqrt{n}$. Thus we can
list all the different matrices and solve them in advance in time
$O(\sqrt{n}\log^{3/2} n)$, and the list would require
$O(\sqrt{n}\log n)$ memory bits.

Thus the preprocessing takes O(n) time, since we can solve each
sub-problem by simply looking in the list.
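The count of distinct matrices follows from a one-line calculation. Writing $s = \sqrt{\log n / (2k)}$ for the bound on the sub-problem size (a shorthand introduced here only for brevity) and taking logarithms base 2, an $s \times s$ matrix over $GF(2^k)$ has $s^2$ entries with $2^k$ choices each, so:

```latex
\[
\left(2^{k}\right)^{s^{2}}
  = 2^{\,k \cdot \frac{\log n}{2k}}
  = 2^{\frac{1}{2}\log n}
  = \sqrt{n},
\qquad s = \sqrt{\frac{\log n}{2k}} .
\]
```

Solving one such system by elimination costs $O(s^3) = O(\log^{3/2} n)$ operations, which is consistent with the $O(\sqrt{n}\log^{3/2} n)$ total stated for building the list.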

We store the data structure as follows. We store all the keys which
map to bad buckets using a regular dictionary, with o(n) memory bits. We
store a big array of fewer than n words, each consisting of k bits,
which is the concatenation of all the sub-buckets in all the buckets. We
also store a select data structure which gives us the ability to jump in
O(1) memory probes to each of the buckets and sub-buckets. It requires
o(n) memory bits as well. Finally, we store all the hash functions. In
section 6 we show how they can be stored. Overall we use nk + o(n)
memory bits.

To answer a query we simply use $h_1$ in order to see which bucket we
need to go to. If it is a bad bucket, we look for the query in the
regular dictionary data structure. Otherwise we use $h_{2,i}$ in order
to find in which sub-bucket the query falls. All the operations up to
this point take O(1) time, and we use one probe to the memory to
retrieve $h_{2,i}$. We use the dictionary data structure of the
sub-bucket in order to answer the query. It takes O(1) probes to the
memory (we retrieve the sub-bucket's vector of at most
$\sqrt{\frac{\log n}{2k}}$ k-bit entries in these probes, and in the
last probe we take a word), but it takes $O(\sqrt{\frac{\log n}{2k}})$
time to retrieve the answer. In order to reduce that time to O(1) we
have two options: we can either use sparse equations, or we can
construct a table holding the answers to all of the possible equations
on all of the possible assignments, and answer the query in O(1) time by
probing the table for the answer (footnote 1).



5 Practical Improvements 

We now examine a few practical improvements for our method. 

Sparse equations: Whenever we use the solution of section 3 (even
inside the sub-buckets) we can use a (ln n)-sparse equation set (in the
sub-bucket case it is ln log n). This still works even when we use only
n variables, therefore it requires nk + o(n) memory bits. Note that this
will not work if every equation has an even number of variables.

Another sparse equations improvement is to create equations which are
more or less local, i.e. the indices $\{i \mid a_i \ne 0\}$ are close to
each other. This way we need fewer memory probes, because in each memory
probe we can get O(log n) contiguous bits.

Another counting argument: If we make each sub-bucket bigger, we can
gain in the o(n) overhead. Denote by s the maximum number of keys hashed
to a sub-bucket (from section 4). In section 4 we had a certain
preprocessing analysis. We now give an alternative one. In each
sub-bucket we hash the keys to $\{1, 2, \ldots, s^2\}$. With probability
at least $\frac{1}{2}$ there will not be any collision in this hash. If
we do have a collision we choose another hash function. On average, 2
bits are required to store which hash function we use in each
sub-bucket. We now have a list of at most s keys from the universe
$\{1, 2, \ldots, s^2\}$, where each key gets a value in $GF(2^k)$. Note
that if we have the same set of keys in two different sub-buckets, we
can use the same set of equations even if they do not get the same
values: being a full rank equation set does not depend on the values
(the free vector). Thus, the number of different sets of equations we
use is $\binom{s^2}{s}$. For $s < \frac{\log n}{2\log\log n}$ we get
$o(\sqrt{n})$ different equation sets. For each of the equation sets we
compute the inverse and store it in a hashtable. The naive way to
perform the preprocessing using this technique takes
$\sum_{i=1}^{\#\text{sub-buckets}} O(\text{sub-bucket-size}^2) = O(n\frac{\log n}{\log\log n})$
time, because we need to multiply the inverse matrix by the data for
each sub-bucket. However, we can collect O(log n) sub-buckets that map
to the same (inverse) matrix and multiply the same matrix by O(log n)
different value vectors. We get an O(log n) speed-up in time using word
operations. Therefore the preprocessing running time shrinks back to
O(n). Making the equations O(ln log n)-sparse and local, we get O(1)
query time as well (footnote 2).

Footnote 1: We can adjust the size of each sub-bucket a little more in
order to do this in o(n) space.
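The claim that s keys hashed into s^2 cells are collision-free with probability at least 1/2 can be checked exactly for small s. The sketch assumes a fully random hash function inside the sub-bucket; the function name is ours.

```python
from fractions import Fraction

def no_collision_probability(s: int) -> Fraction:
    """Exact probability that s keys hashed uniformly and independently
    into s*s cells all land in distinct cells."""
    cells = s * s
    p = Fraction(1)
    for i in range(s):
        # the (i+1)-th key must avoid the i cells that are already taken
        p *= Fraction(cells - i, cells)
    return p
```

The same conclusion follows from the union bound: there are s(s-1)/2 pairs of keys and each pair collides with probability 1/s^2, so the collision probability is below 1/2 and a constant expected number of hash functions has to be tried per sub-bucket.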

A real nk solution: We can get rid of the extra o(n) by solving n
equations in n variables. Each equation will be a (ln n)-sparse
equation. The preprocessing takes $O(n^2)$ time using the block
Wiedemann algorithm [15], and the query takes O(log n) time. Note that
we need to use a uniform hash function for this result.



6 Using simple hash functions 

We only assume a truly random hash function inside the buckets. Each
bucket consists of at most $2\log^2 n$ keys. Therefore we can construct
the hash function by simply using an array R of polylogarithmically many
random numbers and a pairwise independent hash function h from U to the
indices of R. The result of the new hash function is R[h(x)]. Given that
we hash at most $2\log^2 n$ keys, the probability that there exist two
keys that use the same random number is less than $\frac{1}{2}$.
Therefore we get a random enough hash function with probability at least
$\frac{1}{2}$. If we store $2\log n$ hash functions like this, then with
probability bigger than $1 - \frac{1}{n}$ each bucket will have at least
one hash function which satisfies it. The only extra space required is
O(log n) memory bits.
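The R[h(x)] construction can be sketched as follows. The concrete choices are assumptions of ours, not taken from the paper: keys are integers below a fixed prime P, h is drawn from the family ((a*x + b) mod P) mod |R|, and the table size is left as a parameter (the usage below takes it to be the square of the number of keys). Class and function names are hypothetical.

```python
import random

P = (1 << 61) - 1  # prime modulus; keys are assumed to be integers below P

class TableHash:
    """x -> R[h(x)], with h(x) = ((a*x + b) mod P) mod len(R)."""
    def __init__(self, table_size: int, out_bits: int, rng: random.Random):
        self.a = rng.randrange(1, P)
        self.b = rng.randrange(0, P)
        # R holds independent random values; distinct slots give independent outputs
        self.R = [rng.getrandbits(out_bits) for _ in range(table_size)]

    def slot(self, x: int) -> int:
        return ((self.a * x + self.b) % P) % len(self.R)

    def __call__(self, x: int) -> int:
        return self.R[self.slot(x)]

def find_good_hash(keys, table_size, out_bits, rng, attempts):
    """Return the first TableHash under which all keys use distinct slots
    of R, or None if every attempt has a collision."""
    for _ in range(attempts):
        f = TableHash(table_size, out_bits, rng)
        if len({f.slot(x) for x in keys}) == len(keys):
            return f
    return None
```

When the keys of a bucket occupy distinct slots, their hash values are independent random entries of R, which is the sense in which the function is random enough for that bucket.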

7 Membership Queries 

We first define a membership data structure. 

Definition 3. A membership data structure (n, k) for
$x_1, x_2, \ldots, x_n \in U$ is a data structure that allows answering
membership queries. Given a query x where x is one of the $x_i$'s, the
data structure always returns 1, and given a query x where x is not one
of the $x_i$'s it returns 0 with probability at least $1 - 2^{-k}$.

Footnote 2: Using tables as well.

We can easily build a membership data structure given a dictionary data
structure. We simply choose a random pairwise independent hash function
$h : U \to \{0, 1, \ldots, 2^k - 1\}$ and we store a dictionary that
maps $x_i$ to $h(x_i)$.

In order to check if x is in the data structure we simply query x from
the dictionary data structure and check if its value equals h(x). If x
is in the data structure it will always return 1.

Theorem 7.1. If x isn't in the data structure we will return 1 with
probability $2^{-k}$.

Proof. We choose the hash function independently of the dictionary data
structure. Therefore the answer to the query x from the dictionary data
structure, if x isn't a member, is a k-bit string which is independent
of h(x). Thus the probability that they are equal is $2^{-k}$, because
h(x) is random. □
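The reduction from membership to dictionary lookup can be illustrated with a toy example. Everything here is a stand-in of ours: SHA-256 replaces the pairwise independent hash h, and ToyOneSidedDict is a placeholder that stores the keys explicitly (so it saves no space) and only mimics the interface of a one-sided error dictionary by returning an unrelated k-bit value for absent keys.

```python
import hashlib

def fingerprint(key: str, k: int, salt: bytes) -> int:
    """Stand-in for the hash h: U -> {0, ..., 2^k - 1}."""
    digest = hashlib.sha256(salt + key.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << k)

class ToyOneSidedDict:
    """Placeholder dictionary: exact on stored keys, and returns some
    unrelated k-bit value for any other key."""
    def __init__(self, pairs: dict, k: int):
        self._pairs, self._k = dict(pairs), k

    def get(self, key: str) -> int:
        if key in self._pairs:
            return self._pairs[key]
        return fingerprint(key, self._k, b"unrelated-junk")

class Membership:
    """Stores x -> h(x) in the dictionary; x is reported as a member when
    the stored value matches a freshly computed h(x)."""
    def __init__(self, keys, k: int, salt: bytes = b"h"):
        self._k, self._salt = k, salt
        self._dict = ToyOneSidedDict({x: fingerprint(x, k, salt) for x in keys}, k)

    def __contains__(self, key: str) -> bool:
        return self._dict.get(key) == fingerprint(key, self._k, self._salt)
```

With k = 8, for instance, every stored key is reported as present, while a key outside the set is accepted only when two unrelated 8-bit values coincide, which happens for roughly one non-member in 256.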

8 Conclusions and Open problems 

We have suggested a new data structure that can replace Bloom Filters.
This data structure allows maintaining a dictionary mapping keys to
values, and allows retrieving the value for a key with a one-sided
error. Our method has significant advantages over Bloom Filters and
other previously known Bloom Filter replacements. It uses only
nk + o(n) memory bits (which is optimal up to o(n)), and each query
takes O(1) memory probes. Also, we only require pairwise independent
hash functions.

We have also suggested a similar data structure that has an even lower
space requirement, of only nk memory bits. However, it has O(log n)
query time and requires $O(n^2)$ preprocessing time. Also, this data
structure requires uniform hash functions.

Despite its advantages, the method we suggest, like several other Bloom 
Filter replacements, does not allow "insertion" operations, which the original 
Bloom Filter technique does support. 

We believe the preprocessing phase of our algorithm can be distributed 
easily. In fact, we believe it should be distributed in most applications, due 
to the memory it consumes. 

There are several directions open for future research. First, it would
be interesting to see if it is possible to design a data structure which
requires only one pass over the input elements and small additional
memory. Also, it may be possible to develop a fully dynamic data
structure with space requirements lower than those of the traditional
Bloom Filter.






References 

[1] Burton H. Bloom. Space/time trade-offs in hash coding with allowable
errors. Commun. ACM, 13(7):422-426, 1970.

[2] A. Broder and M. Mitzenmacher. Network applications of bloom
filters: A survey, 2002.

[3] Andrej Brodnik and J. Ian Munro. Membership in constant time and
almost-minimum space. SIAM J. Comput., 28(5):1627-1640, 1999.

[4] Bernard Chazelle, Joe Kilian, Ronitt Rubinfeld, and Ayellet Tal. The
bloomier filter: an efficient data structure for static support lookup
tables. In SODA '04: Proceedings of the fifteenth annual ACM-SIAM
symposium on Discrete algorithms, pages 30-39, Philadelphia, PA,
USA, 2004. Society for Industrial and Applied Mathematics.

[5] Saar Cohen and Yossi Matias. Spectral bloom filters. In SIGMOD
'03: Proceedings of the 2003 ACM SIGMOD international conference
on Management of data, pages 241-252, New York, NY, USA, 2003.
ACM.

[6] Li Fan, Pei Cao, Jussara Almeida, and Andrei Z. Broder. Summary 
cache: a scalable wide-area web cache sharing protocol. IEEE/ACM 
Trans. Netw., 8(3):281-293, 2000. 

[7] Lee L. Gremillion. Designing a bloom filter for differential file access. 
Commun. ACM, 25(9):600-604, 1982. 

[8] Udi Manber and Sun Wu. An algorithm for approximate member- 
ship checking with application to password security. Inf. Process. Lett., 
50(4):191-197, 1994. 

[9] M. D. McIlroy. Development of a spelling list. IEEE Transactions on
Communications, 30(1):91-99, 1982.

[10] Michael Mitzenmacher. Compressed bloom filters. IEEE/ACM Trans. 
Netw., 10(5):604-612, 2002. 

[11] J. K. Mullin and D. J. Margoliash. A tale of three spelling checkers. 
Softw. Pract. Exper., 20(6):625-630, 1990. 

[12] James K. Mullin. A second look at bloom filters. Commun. ACM, 
26(8):570-571, 1983. 






[13] Anna Pagh, Rasmus Pagh, and S. Srinivasa Rao. An optimal bloom
filter replacement. In SODA '05: Proceedings of the sixteenth annual
ACM-SIAM symposium on Discrete algorithms, pages 823-829,
Philadelphia, PA, USA, 2005. Society for Industrial and Applied
Mathematics.

[14] E. H. Spafford. Opus: Preventing weak password choices. Computer 
and Security, 10:273-278, 1992. 

[15] G. Villard. A study of Coppersmith's block Wiedemann algorithm
using matrix polynomials.

[16] D. H. Wiedemann. Solving sparse linear equations over finite fields.
IEEE Trans. Information Theory, IT-32(1):54-62, 1986.






